Smart Paper Notes

End-to-end Wind Turbine Wake Modelling with Deep Graph Representation Learning

Siyi Li , Mingrui Zhang , Matthew D. Piggott

Category: Machine Learning

2022-11-24

Wind turbine wake modelling is of crucial importance to accurate resource assessment, to layout optimisation, and to the operational control of wind farms. This work proposes a surrogate model for the representation of wind turbine wakes based on a state-of-the-art graph representation learning method termed a graph neural network. The proposed end-to-end deep learning model operates directly on unstructured meshes and has been validated against high-fidelity data, demonstrating its ability to rapidly make accurate 3D flow field predictions for various inlet conditions and turbine yaw angles. The specific graph neural network model employed here is shown to generalise well to unseen data and is less sensitive to over-smoothing compared to common graph neural networks. A case study based upon a real world wind farm further demonstrates the capability of the proposed approach to predict farm scale power generation. Moreover, the proposed graph neural network framework is flexible and highly generic and as formulated here can be applied to any steady state computational fluid dynamics simulations on unstructured meshes.

Translated by Google Translate

Confidence Propagation Cluster: Unleash Full Potential of Object Detectors

Yichun Shen* , Wanli Jiang* , Zhen Xu , Rundong Li , Junghyun Kwon , Siyi Li

Category: Computer Vision

2021-12-01

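Most object detection methods have long obtained their final objects by applying non-maximum suppression (NMS), or improved versions such as Soft-NMS, to remove redundant bounding boxes. We challenge these NMS-based methods on three points: 1) the bounding box with the highest confidence value may not be the true positive with the largest overlap with the ground-truth box; 2) redundant boxes need to be suppressed, but true positives also need their confidence enhanced; 3) sorting candidate boxes by confidence value is unnecessary, so full parallelism is achievable. In this paper, inspired by belief propagation (BP), we propose the Confidence Propagation Cluster (CP-Cluster) to replace NMS-based methods; it is fully parallelizable and more accurate. In CP-Cluster, we borrow the message-passing mechanism of BP to penalize redundant boxes and enhance true positives simultaneously, iterating until convergence. We verify the effectiveness of CP-Cluster by applying it to various mainstream detectors such as Faster R-CNN, SSD, FCOS, YOLOv3, YOLOv5, and CenterNet. Experiments on MS COCO show that our plug-and-play method, without retraining the detectors, steadily improves the average mAP of all of these state-of-the-art models by a clear margin of 0.2 to 1.9 compared with NMS-based methods. Source code is available at https://github.com/shenyi0220/cp-cluster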
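For context, the baseline that this abstract argues against can be sketched as follows. This is a minimal illustration of standard greedy NMS, not of CP-Cluster itself; the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions chosen for the example. Note the confidence sort and the sequential keep/suppress loop, which are the steps the abstract identifies as obstacles to parallelism.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2, y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Step 1: sort candidates by confidence (the sort the abstract calls unnecessary).
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    # Step 2: sequentially keep a box only if it does not overlap a kept box too much.
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(greedy_nms(demo_boxes, demo_scores))
```

In this sketch a suppressed box is simply discarded and the kept box's score is never changed, which is the behaviour that points 1) and 2) of the abstract question.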


Going Beyond XAI: A Systematic Survey for Explanation-Guided Learning

Yuyang Gao , Siyi Gu , Junji Jiang , Sungsoo Ray Hong , Dazhou Yu , Liang Zhao

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-12-07

As the societal impact of Deep Neural Networks (DNNs) grows, the goals for advancing DNNs become more complex and diverse, ranging from improving a conventional model accuracy metric to infusing advanced human virtues such as fairness, accountability, transparency (FaccT), and unbiasedness. Recently, techniques in Explainable Artificial Intelligence (XAI) are attracting considerable attention, and have tremendously helped Machine Learning (ML) engineers in understanding AI models. However, at the same time, we started to witness the emerging need beyond XAI among AI communities; based on the insights learned from XAI, how can we better empower ML engineers in steering their DNNs so that the model's reasonableness and performance can be improved as intended? This article provides a timely and extensive literature overview of the field Explanation-Guided Learning (EGL), a domain of techniques that steer the DNNs' reasoning process by adding regularization, supervision, or intervention on model explanations. In doing so, we first provide a formal definition of EGL and its general learning paradigm. Secondly, an overview of the key factors for EGL evaluation, as well as summarization and categorization of existing evaluation procedures and metrics for EGL are provided. Finally, the current and potential future application areas and directions of EGL are discussed, and an extensive experimental study is presented aiming at providing comprehensive comparative studies among existing EGL models in various popular application domains, such as Computer Vision (CV) and Natural Language Processing (NLP) domains.


Leveraging Large Language Models for Robot 3D Scene Understanding

William Chen , Siyi Hu , Rajat Talak , Luca Carlone

Categories: Robotics | Natural Language Processing | Computer Vision | Machine Learning

2022-09-12

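Semantic 3D scene understanding is a problem of critical importance in robotics. Although significant advances have been made in spatial perception, robots are still far from having an average human's common-sense knowledge of household objects and locations. We therefore investigate the use of large language models to impart common sense for scene understanding. Specifically, we introduce three paradigms for leveraging language to classify rooms in indoor environments based on the objects they contain: (i) a zero-shot approach, (ii) a feed-forward classifier approach, and (iii) a contrastive classifier approach. These methods operate on 3D scene graphs produced by modern spatial perception systems. We then analyze each approach, demonstrating notable zero-shot generalization and transfer capabilities stemming from the use of language. Finally, we show that these approaches also apply to inferring building labels from the rooms a building contains, and we demonstrate our zero-shot approach in a real environment. All code is available at https://github.com/mit-spark/llm_scene_understanding.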


FairDisCo: Fairer AI in Dermatology via Disentanglement Contrastive Learning

Siyi Du , Ben Hers , Nourhan Bayasi , Ghassan Hamarneh , Rafeef Garbi

Category: Computer Vision

2022-08-22

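Deep learning models have achieved great success in automating skin lesion diagnosis. However, the ethnic disparity in these models' predictions, where lesions on darker skin types are usually underrepresented and diagnosed less accurately, has received little attention. In this paper, we propose FairDisCo, a disentanglement deep learning framework with contrastive learning. It uses an additional network branch to remove the sensitive attribute, i.e. skin-type information, from the learned representations, and another contrastive branch to enhance feature extraction. We compare FairDisCo with three fairness methods, namely resampling, reweighting, and attribute-aware, on two newly released skin lesion datasets covering different skin types: Fitzpatrick17k and Diverse Dermatology Images (DDI). We adapt two fairness-based metrics, DPM and EOM, to our task with multiple classes and sensitive attributes, highlighting the skin-type bias in skin lesion classification. Extensive experimental evaluation demonstrates the effectiveness of FairDisCo, with fairer and superior performance on skin lesion classification tasks.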


RES: A Robust Framework for Guiding Visual Explanation

Yuyang Gao , Tong Steven Sun , Guangji Bai , Siyi Gu , Sungsoo Ray Hong , Liang Zhao

Category: Computer Vision

2022-06-27

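Despite rapid progress in explanation techniques for modern deep neural networks (DNNs), where the main focus is on "how to generate explanations", more advanced research questions remain relatively under-explored: those that examine the quality of the explanation itself (e.g., "are the explanations accurate?") and those that improve explanation quality (e.g., "how can the model be adjusted to generate more accurate explanations when its explanations are inaccurate?"). To guide the model toward better explanations, explanation supervision techniques, which add supervision signals on the model explanation, have started to show promising effects on improving both the generalizability and the intrinsic interpretability of deep neural networks. However, research on supervising explanations, especially in vision-based applications where explanations are represented as saliency maps, is still at an early stage because of several inherent challenges: 1) inaccuracy of the boundaries of human explanation annotations, 2) incompleteness of the regions of human explanation annotations, and 3) inconsistency of the data distribution between human annotations and model explanation maps. To address these challenges, we propose a generic RES framework for guiding visual explanation. It develops a novel objective that handles the inaccurate boundaries, incomplete regions, and inconsistent distribution of human annotations, with a theoretical justification for model generalizability. Extensive experiments on two real-world image datasets demonstrate the effectiveness of the proposed framework in enhancing both the reasonability of the explanations and the performance of the backbone DNN model.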


Extracting Zero-shot Common Sense from Large Language Models for Robot 3D Scene Understanding

William Chen , Siyi Hu , Rajat Talak , Luca Carlone

Categories: Robotics | Natural Language Processing

2022-06-09

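Semantic 3D scene understanding is a problem of critical importance in robotics. Although significant advances have been made in simultaneous localization and mapping algorithms, robots are still far from having an average human's common-sense knowledge of household objects and their locations. We introduce a novel method for leveraging the common sense embedded in large language models to label rooms given the objects they contain. The algorithm has the added benefits of (i) requiring no task-specific pre-training (it operates entirely in the zero-shot regime) and (ii) generalizing to arbitrary room and object labels, both of which are highly desirable traits in robotic scene understanding algorithms. The proposed algorithm operates on 3D scene graphs produced by modern spatial perception systems, and we hope it will pave the way to more generalizable and scalable high-level 3D scene understanding for robotics.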


Optimal Variable Clustering for High-Dimensional Matrix Valued Data

Inbeom Lee , Siyi Deng , Yang Ning

Categories: Machine Learning (Statistics) | Machine Learning

2021-12-24

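Matrix-valued data has become increasingly prevalent in many applications. Most existing clustering methods for this type of data are tailored to the mean model and do not account for the dependence structure of the features, which can be very informative, especially in high-dimensional settings. To extract information from the dependence structure for clustering, we propose a new latent variable model for features arranged in matrix form, with unknown membership matrices representing the clusters of the rows and columns. Under this model, we further propose a class of hierarchical clustering algorithms that use the difference of a weighted covariance matrix as the dissimilarity measure. Theoretically, we show that under mild conditions our algorithm attains clustering consistency in the high-dimensional setting. While this consistency result holds for our algorithm with a broad class of weighted covariance matrices, the conditions for the result depend on the choice of weight. To investigate how the weight affects the theoretical performance of our algorithm, we establish the minimax lower bound for clustering under our latent variable model. Given these results, we identify the optimal weight, in the sense that using this weight guarantees that our algorithm is minimax rate-optimal in terms of the magnitude of a cluster separation metric. The practical implementation of our algorithm with the optimal weight is also discussed. Finally, we conduct simulation studies to evaluate the finite-sample performance of our algorithm and apply the method to a genomic dataset.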


Design Challenges for a Multi-Perspective Search Engine

Sihao Chen , Siyi Liu , Xander Uyttendaele , Yi Zhang , William Bruno , Dan Roth

Category: Natural Language Processing

2021-12-15

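Many users turn to document retrieval systems (e.g., search engines) to seek answers to controversial questions. Answering such queries usually requires identifying responses within web documents and aggregating them according to their different perspectives. Classical document retrieval systems fall short of delivering a set of direct and diverse responses to users. Identifying such responses within a document is, naturally, a natural language understanding task. In this paper, we examine the challenges of combining such language understanding objectives with document retrieval, and we study a new perspective-oriented document retrieval paradigm. We discuss and assess the inherent natural language understanding challenges involved in achieving this goal. Following the design challenges and principles, we demonstrate and evaluate a practical prototype pipeline system. We use the prototype system to conduct a user survey in order to assess the utility of our paradigm and to understand users' information needs for controversial queries.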


Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

Category: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
